Be Kind, Rewind: Checkpoint & Restore Capability for Improving Reliability of Large-Scale Semiconductor Design

Authors

  • Igor Ljubuncic
  • Ravi Giri
  • Andrew Goldis
  • Avikam Rozenfeld
Abstract

Intel’s chip design runs in a large-scale, globally distributed environment with 600,000 cores. In the current semiconductor market, a combination of factors such as time-to-market pressure, explosive growth in the mobile market segment, and upcoming new markets has led to a significant increase in the demand for computing resources and in the reliability required of them. Checkpointing is a capability that can significantly improve reliability; however, there is no mature solution that takes periodic snapshots of running compute jobs so that they can be replayed later, in a consistent manner, in a large-scale environment. Intel IT has partnered with the Northeastern University (NEU) Distributed Multi-Threaded Checkpointing (DMTCP) team to improve their checkpoint & restore solution for the design computing environment. This paper elaborates on the technological breakthroughs, the industry-academia partnership, and the open-source contribution.

Keywords: Intel; Information Technology; Engineering Computing; Checkpoint & Restore; Checkpointing; Distributed MultiThreaded Checkpointing; DMTCP; CPU design

Similar resources

THE OPTIMIZATION OF LARGE-SCALE DOME TRUSSES ON THE BASIS OF THE PROBABILITY OF FAILURE

Metaheuristic algorithms are preferred by many researchers for the reliability-based design optimization (RBDO) of truss structures. The cross-sectional areas of the truss elements are taken as the design variables for size optimization under frequency constraints. The design of dome truss structures is optimized based on reliability by a popular metaheuristic optimization tec...

IMPROVED BAT ALGORITHM FOR OPTIMUM DESIGN OF LARGE-SCALE TRUSS STRUCTURES

Determining the optimum design of large-scale structures is a difficult task. The large number of design variables, the size of the search space, and the need to control a large number of design constraints are the major obstacles to performing optimum design of large-scale truss structures in a reasonable time. Meta-heuristic algorithms are known as one of the useful tools to d...

A Variable Structure Observer Based Control Design for a Class of Large scale MIMO Nonlinear Systems

This paper discusses in full how to design an observer-based decentralized fuzzy adaptive controller for a class of large-scale multivariable non-canonical nonlinear systems with unknown functions of the subsystems’ states. On-line tuning mechanisms to adjust the parameters of both the direct adaptive controller and the observer that guarantee the ultimate boundedness of both the tracking error and tha...

Design and fabrication of three axes accelerometer

Technology can provide safety in physical activities and improve their performance, so the manufacture of equipment for these purposes has been widely considered. The aim of the present paper was the design and fabrication of a three-axis accelerometer. Three acceleration sensors, three gyroscopes for angular velocity measurement, a microcontroller for converting analog data to ...

The Generation of Earthquake PGA Using Stochastic Finite Fault Method in Alborz Region

Time-history analysis is a kind of dynamic analysis increasingly used in the design of structures and the evaluation of existing ones. One of the important issues in time-history analysis is selecting earthquake records. In this case, seismic design provisions state that time histories shall have source mechanisms and geological and seismological features similar to those of the region under study. A...

Publication date: 2014